topological similarity
Skeletonization Quality Evaluation: Geometric Metrics for Point Cloud Analysis in Robotics
Wen, Qingmeng, Lai, Yu-Kun, Ji, Ze, Tafrishi, Seyed Amir
Skeletonization is a powerful tool for shape analysis, rooted in the inherent instinct to understand an object's morphology. It has found applications across various domains, including robotics. Although skeletonization algorithms have been studied in recent years, their performance is rarely quantified with detailed numerical evaluations. This work focuses on defining and quantifying geometric properties to systematically score the skeletonization results of point cloud shapes across multiple aspects, including topological similarity, boundedness, centeredness, and smoothness. We introduce these representative metric definitions along with a numerical scoring framework to analyze skeletonization outcomes on point cloud data for different scenarios, from object manipulation to mobile robot navigation. Additionally, we provide an open-source tool to enable the research community to evaluate and refine their skeleton models. Finally, we assess the performance and sensitivity of the proposed geometric evaluation methods across various robotic applications.
A Complexity-Based Theory of Compositionality
Elmoznino, Eric, Jiralerspong, Thomas, Bengio, Yoshua, Lajoie, Guillaume
Compositionality is believed to be fundamental to intelligence. In humans, it underlies the structure of thought, language, and higher-level reasoning. In AI, compositional representations can enable a powerful form of out-of-distribution generalization, in which a model systematically adapts to novel combinations of known concepts. However, while we have strong intuitions about what compositionality is, there currently exists no formal definition for it that is measurable and mathematical. Here, we propose such a definition, which we call representational compositionality, that accounts for and extends our intuitions about compositionality. The definition is conceptually simple, quantitative, grounded in algorithmic information theory, and applicable to any representation. Intuitively, representational compositionality states that a compositional representation satisfies three properties. First, it must be expressive. Second, it must be possible to re-describe the representation as a function of discrete symbolic sequences with re-combinable parts, analogous to sentences in natural language. Third, the function that relates these symbolic sequences to the representation, analogous to semantics in natural language, must be simple. Through experiments on both synthetic and real world data, we validate our definition of compositionality and show how it unifies disparate intuitions from across the literature in both AI and cognitive science. We also show that representational compositionality, while theoretically intractable, can be readily estimated using standard deep learning tools. Our definition has the potential to inspire the design of novel, theoretically-driven models that better capture the mechanisms of compositional thought.
Simplifying complex machine learning by linearly separable network embedding spaces
Xenos, Alexandros, Malod-Dognin, Noel, Przulj, Natasa
Low-dimensional embeddings are a cornerstone in the modelling and analysis of complex networks. However, most existing approaches for mining network embedding spaces rely on computationally intensive machine learning systems to facilitate downstream tasks. In the field of NLP, word embedding spaces capture semantic relationships linearly, allowing for information retrieval using simple linear operations on word embedding vectors. Here, we demonstrate that there are structural properties of network data that yield this linearity. We show that the more homophilic the network representation, the more linearly separable the corresponding network embedding space, yielding better downstream analysis results. Hence, we introduce novel graphlet-based methods that embed networks into more linearly separable spaces, allowing for their better mining. Our fundamental insights into the structure of network data that enables its linear mining and exploitation give the ML community a basis to build upon, towards efficient and explainable mining of complex network data.
Understanding Simplicity Bias towards Compositional Mappings via Learning Dynamics
Ren, Yi, Sutherland, Danica J.
Obtaining compositional mappings is important for the model to generalize well compositionally. To better understand when and how to encourage the model to learn such mappings, we study their uniqueness through different perspectives. Specifically, we first show that the compositional mappings are the simplest bijections through the lens of coding length (i.e., an upper bound of their Kolmogorov complexity). This property explains why models having such mappings can generalize well. We further show that the simplicity bias is usually an intrinsic property of neural network training via gradient descent. That partially explains why some models spontaneously generalize well when they are trained appropriately.
On Class Separability Pitfalls In Audio-Text Contrastive Zero-Shot Learning
Tavares, Tiago, Ayres, Fabio, Wang, Zhepei, Smaragdis, Paris
Recent advances in audio-text cross-modal contrastive learning have shown its potential for zero-shot learning. One way to achieve this is to project item embeddings from pre-trained backbone neural networks into a cross-modal space in which item similarity can be calculated in either domain. This process relies on strong unimodal pre-training of the backbone networks, and on a data-intensive training task for the projectors. These two processes can be biased by unintentional data leakage, which can arise from using supervised learning in pre-training or from inadvertently training the cross-modal projection using labels from the zero-shot learning evaluation. In this study, we show that a significant part of the measured zero-shot learning accuracy is due to strengths inherited from the audio and text backbones, that is, they are not learned in the cross-modal domain and are not transferred from one modality to another.
Deep Regression Representation Learning with Topology
Zhang, Shihao, Kawaguchi, Kenji, Yao, Angela
Most works studying representation learning focus only on classification and neglect regression. Yet, the learning objectives and, therefore, the representation topologies of the two tasks are fundamentally different: classification targets class separation, leading to disconnected representations, whereas regression requires ordinality with respect to the target, leading to continuous representations. We thus wonder how the effectiveness of a regression representation is influenced by its topology, with evaluation based on the Information Bottleneck (IB) principle. The IB principle is an important framework that provides principles for learning effective representations. We establish two connections between it and the topology of regression representations. The first connection reveals that a lower intrinsic dimension of the feature space implies a reduced complexity of the representation Z. This complexity can be quantified as the conditional entropy of Z on the target Y, and serves as an upper bound on the generalization error. The second connection suggests a feature space that is topologically similar to the target space will better align with the IB principle. Based on these two connections, we introduce PH-Reg, a regularizer specific to regression that matches the intrinsic dimension and topology of the feature space with the target space. Experiments on synthetic and real-world regression tasks demonstrate the benefits of PH-Reg. Code: https://github.com/needylove/PH-Reg.
Graph Embedding via Diffusion-Wavelets-Based Node Feature Distribution Characterization
Wang, Lili, Huang, Chenghan, Ma, Weicheng, Cao, Xinyuan, Vosoughi, Soroush
Recent years have seen a rise in the development of representational learning methods for graph data. Most of these methods, however, focus on node-level representation learning at various scales (e.g., microscopic, mesoscopic, and macroscopic node embedding). In comparison, methods for representation learning on whole graphs are currently relatively sparse. In this paper, we propose a novel unsupervised whole graph embedding method. Our method uses spectral graph wavelets to capture topological similarities on each k-hop sub-graph between nodes and uses them to learn embeddings for the whole graph. We evaluate our method against 12 well-known baselines on 4 real-world datasets and show that our method achieves the best performance across all experiments, outperforming the current state-of-the-art by a considerable margin.
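As an illustration of how spectral graph wavelets can expose topological similarity between nodes, here is a minimal sketch. It is not the authors' method: it assumes a heat-kernel wavelet exp(-s L) on the combinatorial Laplacian L (one common choice, not stated in the abstract), approximates it with a truncated Taylor series, and uses the sorted wavelet coefficients of a node as a hypothetical structural signature. The function names `laplacian`, `heat_kernel`, and `node_signature` are illustrative only.

```python
def laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph (dense lists)."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def heat_kernel(adj, scale, terms=30):
    """Approximate exp(-scale * L) with a truncated Taylor series.

    Assumption: a heat-kernel wavelet; adequate only for small graphs
    and moderate scales, where the series converges quickly.
    """
    n = len(adj)
    lap = laplacian(adj)
    m = [[-scale * lap[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        # term becomes m^order / order!
        term = [[x / order for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result


def node_signature(kernel, node):
    """Sorted wavelet coefficients of one node: a permutation-invariant summary."""
    return sorted(row[node] for row in kernel)


# Path graph 0 - 1 - 2 - 3: the two end nodes play the same structural role.
path = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
kernel = heat_kernel(path, scale=0.5)
print(node_signature(kernel, 0))
print(node_signature(kernel, 3))
```

In this toy graph the two end nodes produce matching signatures while an end node and an interior node do not, which is the kind of structural similarity a whole-graph embedding could aggregate.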
Inter-layer Information Similarity Assessment of Deep Neural Networks Via Topological Similarity and Persistence Analysis of Data Neighbour Dynamics
Hryniowski, Andrew, Wong, Alexander
The quantitative analysis of information structure through a deep neural network (DNN) can unveil new insights into the theoretical performance of DNN architectures. Two very promising avenues of research towards quantitative information structure analysis are: 1) layer similarity (LS) strategies focused on the inter-layer feature similarity, and 2) intrinsic dimensionality (ID) strategies focused on layer-wise data dimensionality using pairwise information. Inspired by both LS and ID strategies for quantitative information structure analysis, we introduce two novel complementary methods for inter-layer information similarity assessment premised on the idea of studying a data sample's neighbourhood dynamics as it traverses a DNN. More specifically, we introduce the concept of Nearest Neighbour Topological Similarity (NNTS) for quantifying the information topology similarity between layers of a DNN. Furthermore, we introduce the concept of Nearest Neighbour Topological Persistence (NNTP) for quantifying the inter-layer persistence of data neighbourhood relationships throughout a DNN. The proposed strategies facilitate efficient inter-layer information similarity assessment by leveraging only local topological information, and we demonstrate their efficacy in this study by analysing a deep convolutional neural network architecture on image data to study the insights that can be gained with respect to the theoretical performance of a DNN.
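The idea of tracking a sample's neighbourhood across layers can be sketched as follows. This is not the paper's NNTS or NNTP definition, which the abstract does not give: it is an assumed stand-in that computes each sample's k nearest neighbours in two layers' feature spaces and averages the Jaccard overlap of the two neighbour sets. The names `knn_indices` and `neighbour_overlap` are hypothetical.

```python
import math


def knn_indices(points, k):
    """For each point, return the index set of its k nearest neighbours (Euclidean)."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def neighbour_overlap(layer_a, layer_b, k):
    """Mean Jaccard overlap of k-NN sets computed in two feature spaces.

    Assumption: rows of layer_a and layer_b describe the same samples in the
    same order. A value of 1.0 means every neighbourhood is preserved.
    """
    nn_a = knn_indices(layer_a, k)
    nn_b = knn_indices(layer_b, k)
    scores = [len(a & b) / len(a | b) for a, b in zip(nn_a, nn_b)]
    return sum(scores) / len(scores)


# Toy "activations" of six samples at three layers.
layer_1 = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
layer_2 = [(3.0 * x,) for (x,) in layer_1]                      # rescaled copy
layer_3 = [(0.0,), (10.0,), (1.0,), (11.0,), (2.0,), (12.0,)]   # neighbourhoods reshuffled

print(neighbour_overlap(layer_1, layer_2, k=2))
print(neighbour_overlap(layer_1, layer_3, k=2))
```

Because the score depends only on local neighbour sets, it is unaffected by uniform rescaling of a layer but drops when samples change neighbours between layers.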
Evaluating the Disentanglement of Deep Generative Models through Manifold Topology
Zhou, Sharon, Zelikman, Eric, Lu, Fred, Ng, Andrew Y., Carlsson, Gunnar, Ermon, Stefano
Learning disentangled representations is regarded as a fundamental task for improving the generalization, robustness, and interpretability of generative models. However, measuring disentanglement has been challenging and inconsistent, often dependent on an ad-hoc external model or specific to a certain dataset. To address this, we present a method for quantifying disentanglement that only uses the generative model, by measuring the topological similarity of conditional submanifolds in the learned representation. To illustrate the effectiveness and applicability of our method, we empirically evaluate several state-of-the-art models across multiple datasets. We find that our method ranks models similarly to existing methods.
Figure 1: Factors in the dSprites dataset displaying topological similarity and semantic correspondence to respective latent dimensions in a disentangled generative model, as shown through Wasserstein RLT distributions of homology and latent interpolations along respective dimensions.
Learning disentangled representations is important for a variety of tasks, including adversarial robustness, generalization to novel tasks, and interpretability (Stutz et al., 2019; Alemi et al., 2017; Ridgeway, 2016; Bengio et al., 2013). Recently, deep generative models have shown marked improvement in disentanglement across an increasing number of datasets and a variety of training objectives (Chen et al., 2016; Lin et al., 2020; Higgins et al., 2017; Kim and Mnih, 2018; Chen et al., 2018b; Burgess et al., 2018; Karras et al., 2019). Nevertheless, quantifying the extent of this disentanglement has remained challenging and inconsistent.
Compositional Languages Emerge in a Neural Iterated Learning Model
Ren, Yi, Guo, Shangmin, Labeau, Matthieu, Cohen, Shay B., Kirby, Simon
The principle of compositionality, which enables natural language to represent complex concepts via a structured combination of simpler ones, allows us to convey an open-ended set of messages using a limited vocabulary. If compositionality is indeed a natural property of language, we may expect it to appear in communication protocols that are created by neural agents in language games. In this paper, we propose an effective neural iterated learning (NIL) algorithm that, when applied to interacting neural agents, facilitates the emergence of a more structured type of language. Indeed, these languages provide learning speed advantages to neural agents during training, which can be incrementally amplified via NIL. We provide a probabilistic model of NIL and an explanation of why the advantage of compositional language exists. Our experiments confirm our analysis, and also demonstrate that the emerged languages largely improve the generalizing power of the neural agent communication.